Dynamic packet training

ABSTRACT

A packet control mechanism for a computer data system that dynamically adjusts packet training depending on the utilization load on the processor. The dynamic adjustment of packet training can be to enable and disable packet training, or adjust the number of packets in the packet train. In preferred embodiments, the computer data system includes a processor utilization mechanism that indicates a load on a processor. When the packet control mechanism determines the load on the processor is above a threshold limit, the packet control mechanism reduces the processor load by compressing the packets into the packet train. The compressing of the packets is stopped or reduced when the processor load is below a threshold in order to increase the data throughput on the network interface.

CROSS-REFERENCE TO PARENT APPLICATION

This patent application is a continuation of U.S. Ser. No. 11/106,011, filed on Apr. 14, 2005, which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Technical Field

This invention generally relates to data processing and communications, and more specifically relates to dynamically transmitting data packets in a packet train on a computer network or computer communication link.

2. Background Art

Computer systems communicate with each other over computer networks. Such networks include multiple nodes, which are typically computers, that may be distributed over vast distances and connected by communications links. Nodes in the computer network communicate with each other using data packets sent over the communication links. The data packets are the basic units of information transfer. A data packet contains data surrounded by control and routing information supplied by the various nodes.

Sending, receiving, and processing of packets have an overhead, or associated cost. That is, it takes time for the central processing unit (CPU) at a node to receive a packet, to examine the packet's control information, and to determine the next action. One way to reduce the packet overhead is a method called packet training. Packet training consolidates individual packets into a group, called a train, so that a node can process the entire train of packets at once. The term “train” is in reference to a train of railroad cars. The packets are formed into a group of sequential packets like a line of railroad cars or a train. Processing a train of packets has less overhead, and thus better performance, than processing each packet individually.

In a typical training method, a node will accumulate packets until the train reaches a fixed target-length. Then the node will process or retransmit the entire packet train at once. Because the packet arrival rate at the node is unpredictable, the method will start a timer when the node receives the train's first packet in order to ensure that the accumulated packets are eventually handled. When the timer expires, the node will end the train and process it even if the train has not reached its target length. This training method works well in times of heavy packet-traffic because the timer never expires. But in times of light packet-traffic, the packets that the node accumulates experience poor performance while waiting in vain for additional packets to arrive, and the ultimate timer expiration introduces additional processing overhead.
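The fixed-length method with a timer described above can be sketched as follows. This is a minimal illustration only, not the prior art implementation; the class name FixedLengthTrainer, the method names add and poll, and the injectable clock parameter are all assumptions introduced for the example.

```python
import time


class FixedLengthTrainer:
    """Hypothetical sketch: accumulate packets until a fixed target length
    is reached, or until a timer started by the first packet expires."""

    def __init__(self, target_length, timeout_s, clock=time.monotonic):
        self.target_length = target_length
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the timer can be simulated
        self.pending = []
        self.first_arrival = None

    def add(self, packet):
        """Queue a packet; return a full train, or None if still accumulating."""
        if not self.pending:
            # The timer starts when the train's first packet arrives.
            self.first_arrival = self.clock()
        self.pending.append(packet)
        if len(self.pending) >= self.target_length:
            return self._flush()
        return None

    def poll(self):
        """Return a partial train if the timer has expired, else None."""
        if self.pending and self.clock() - self.first_arrival >= self.timeout_s:
            return self._flush()
        return None

    def _flush(self):
        train, self.pending = self.pending, []
        self.first_arrival = None
        return train
```

Under light traffic the train is released only by poll, which is the waiting period that the paragraph above identifies as the source of added latency.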

In another prior art packet training method, described in U.S. Pat. No. 5,859,853 to David Glen Carlson and incorporated herein by reference, the system dynamically adjusts the number of packets sent in a train from a node to reflect the rate-of-packets arriving at a node in a network. A packet controller determines the optimum train-length, that is the optimum number-of-packets to send in a train. The node also has a timer interval, which is the maximum time-to-wait before sending the next train. The packet controller samples the packet arrival-rate and calculates the elapsed time to receive a number-of-packets in a train. This elapsed time is referred to as a sampling interval. The packet controller calibrates the optimum train-length when the sampling interval changes significantly from the historic sampling-interval. This method provides dynamic training of packets but does not efficiently handle message latency, particularly for burst mode communication traffic in a low CPU utilization environment.

Packet training can save a significant amount of CPU load in a heavy communications workload environment. However, packet training can have a detrimental effect on the latency of messages sent over the network. When a message is sent with packet training, the message may be delayed while a packet train is being assembled. Thus there is a tradeoff between CPU load and communication latency when using packet training. Packet training decreases the load on the CPU but may increase the time for a message to be sent over the network due to the delay in building a train of packets. Without a way to optimize the tradeoff between CPU loading and network latency, the computer industry will continue to suffer from sub-optimum performance from a packet data network.

DISCLOSURE OF INVENTION

According to the preferred embodiments, a computer data system includes a packet control mechanism that dynamically adjusts packet training depending on the utilization load on the processor. The dynamic adjustment of packet training can be to enable and disable packet training, or adjust the number of packets in the packet train. In preferred embodiments, the computer data system includes a processor utilization mechanism that indicates a load on a processor. When the packet control mechanism determines the load on the processor is above a threshold limit, the packet control mechanism reduces the processor load by processing the packets into a packet train. The training of the packets is stopped or reduced when the processor load is below a threshold in order to increase the data throughput on the network interface.

The foregoing and other features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings.

BRIEF DESCRIPTION OF DRAWINGS

The preferred embodiments of the present invention will hereinafter be described in conjunction with the appended drawings, where like designations denote like elements, and:

FIG. 1 is a block diagram of a computer system according to a preferred embodiment;

FIG. 2 is a more detailed block diagram of the computer system in FIG. 1;

FIG. 3 depicts a data structure of an example packet, in accordance with the prior art;

FIG. 4 depicts a data structure of an example packet train, in accordance with the prior art;

FIG. 5 illustrates a method in accordance with a preferred embodiment; and

FIG. 6 illustrates a method in accordance with another preferred embodiment.

BEST MODE FOR CARRYING OUT THE INVENTION

The present invention relates to dynamic packet training in a data packet network depending on the loading of the CPU. The Overview Section immediately below is intended to provide an introductory explanation of packet training operations and history for individuals who need additional background in this area. Those who are skilled in the art may wish to skip this section and begin with the Detailed Description section instead.

Overview

Computer networks typically have multiple nodes connected by communications links, such as telephone networks. Each node typically includes a processing element, which processes data, and a communications-control unit, which controls the transmission and reception of data in the network across the communications link. The processing element can include one or more processors and memory.

Nodes communicate with each other using packets, which are the basic units of information transfer. A packet contains data surrounded by control and routing information supplied by the various nodes in the network. A message from one node to another may be sent via a single packet, or the node can break the message up into several shorter packets with each packet containing a portion of the message. The communications-control unit at a node receives a packet from the communications link and sends the packet to the node's processing element for processing. Likewise, a node's processing element sends a packet to the node's communications-control unit, which transmits the packet across the network.

Referring to FIG. 3, the data structure for a typical packet 300 is depicted, which includes header section 302 and data section 304. Header section 302 contains control information that encapsulates data 304. For example, header section 302 might contain protocol, session, source, or destination information used for routing packet 300 over network 170 (FIG. 1). Data section 304 could contain electronic mail, files, documents, or any other information desired to be communicated over network 170. Data section 304 could also contain another entire packet, including header and data sections.

Processing of packets has an overhead, or cost, associated with it. That is, it takes time to receive a packet at a node, to examine the packet's control information, and to determine what to do next with the packet. One way to reduce the packet overhead is to use a method called packet-training. Packet-training consolidates individual packets into a group, called a train, which reduces the overhead when compared to processing the same number of packets individually because a node can process the entire train of packets at once.

Referring to FIG. 4, a data structure example of a packet train 400 represents both the prior art and the packet train structure used by the preferred embodiments. Packet train 400 contains control information 402, the number of packets 404, a number of lengths 406 (length 406 a, length 406 b and so forth to length 406 c), and a number of packets 408 (packet 408 a, packet 408 b and so forth to packet 408 c). Control information 402 can specify, among other things, that the information that follows is part of a packet train. Number of packets 404 indicates how many packets are in the train. In this example, there are “n” packets in the train. Length 1 to length n are the lengths of packet 1 to packet n, respectively. Each packet 408 a to packet 408 c can contain header and data, as shown in FIG. 3. Packet train 400 is transferred between nodes as one unit.
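The layout of packet train 400 can be illustrated with a short serialization sketch. The field widths are assumptions made only for this example (a one-byte control value, a two-byte packet count, and two-byte lengths in network byte order); the description above does not specify them. TRAIN_FLAG, build_train and parse_train are hypothetical names.

```python
import struct

TRAIN_FLAG = 0x01  # hypothetical control value meaning "a packet train follows"
HEADER = "!BH"     # assumed widths: 1-byte control info, 2-byte number of packets


def build_train(packets):
    """Lay out control info, packet count, per-packet lengths, then packets."""
    header = struct.pack(HEADER, TRAIN_FLAG, len(packets))
    lengths = b"".join(struct.pack("!H", len(p)) for p in packets)
    return header + lengths + b"".join(packets)


def parse_train(buf):
    """Split a serialized train back into its individual packets."""
    control, count = struct.unpack_from(HEADER, buf, 0)
    if control != TRAIN_FLAG:
        raise ValueError("buffer does not start with the train control value")
    offset = struct.calcsize(HEADER)
    lengths = struct.unpack_from(f"!{count}H", buf, offset)
    offset += 2 * count
    packets = []
    for length in lengths:
        packets.append(buf[offset:offset + length])
        offset += length
    return packets
```

Because every length precedes the packet bodies, a receiver can locate each packet without inspecting packet headers, which is consistent with the train being transferred and handled as one unit.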

DETAILED DESCRIPTION

Preferred embodiments illustrate a computer data system that dynamically adjusts packet training for network communication traffic on a network node depending on the processor loading. The network could have computer systems as its nodes, or the network could have processors in a multi-processor system as its nodes, or the network could be a combination of processors and computer systems. In the preferred embodiment, a node has a packet controller that dynamically enables and disables packet training. A suitable computer system is described below.

Referring to FIG. 1, a computer system 100 is shown in accordance with the preferred embodiments of the invention. Computer system 100 is an IBM eServer iSeries computer system. However, those skilled in the art will appreciate that the mechanisms and apparatus of the present invention apply equally to any computer system, regardless of whether the computer system is a complicated multi-user computing apparatus, a single user workstation, or an embedded control system. As shown in FIG. 1, computer system 100 comprises a processor (central processing unit or CPU) 110, a main memory 120, a mass storage interface 130, a display interface 140, and a network interface 150. These system components are interconnected through the use of a system bus 160. Mass storage interface 130 is used to connect mass storage devices, such as a direct access storage device 155, to computer system 100. One specific type of direct access storage device 155 is a readable and writable CD RW drive, which may store data to and read data from a CD RW 195.

Processor 110 may be constructed from one or more microprocessors and/or integrated circuits. Processor 110 executes program instructions stored in main memory 120. Main memory 120 stores programs and data that processor 110 may access. When computer system 100 starts up, processor 110 initially executes the program instructions that make up operating system 122. Operating system 122 is a sophisticated program that manages the resources of computer system 100. Some of these resources are processor 110, main memory 120, mass storage interface 130, display interface 140, network interface 150, and system bus 160.

Although computer system 100 is shown to contain only a single processor and a single system bus, those skilled in the art will appreciate that the present invention may be practiced using a computer system that has multiple processors and/or multiple buses. In addition, the interfaces that are used in the preferred embodiment each include separate, fully programmed microprocessors that are used to off-load compute-intensive processing from processor 110. However, those skilled in the art will appreciate that the present invention applies equally to computer systems that simply use I/O adapters to perform similar functions.

Display interface 140 is used to directly connect one or more displays 165 to computer system 100. These displays 165, which may be non-intelligent (i.e., dumb) terminals or fully programmable workstations, are used to allow system administrators and users to communicate with computer system 100. Note, however, that while display interface 140 is provided to support communication with one or more displays 165, computer system 100 does not necessarily require a display 165, because all needed interaction with users and other processes may occur via network interface 150.

Network interface 150 is used to connect other computer systems and/or workstations (e.g., 175 in FIG. 1) to computer system 100 across a network 170. The present invention applies equally no matter how computer system 100 may be connected to other computer systems and/or workstations, regardless of whether the network connection 170 is made using present-day analog and/or digital techniques or via some networking mechanism of the future. In addition, many different network protocols can be used to implement a network. These protocols are specialized computer programs that allow computers to communicate across network 170. TCP/IP (Transmission Control Protocol/Internet Protocol) is an example of a suitable network protocol.

Main memory 120 in accordance with the preferred embodiments contains data 121, an operating system 122, an application 123 and a packet controller 124. Data 121 represents any data that serves as input to or output from any program in computer system 100. Operating system 122 is a multitasking operating system known in the industry as OS/400; however, those skilled in the art will appreciate that the spirit and scope of the present invention is not limited to any one operating system. The application 123 is any application software program operating in the system that processes data 121. The packet controller 124 operates in conjunction with the communications controller 152 in the network interface 150 to dynamically adjust the packet compression as described further below. Packet controller 124 includes one or more thresholds 125 for comparing to the utilization level of the processor, and one or more maximum train sizes 126 for setting the maximum number of packets in a packet train. The thresholds 125 and maximum train sizes 126 are described further below.

Computer system 100 utilizes well known virtual addressing mechanisms that allow the programs of computer system 100 to behave as if they only have access to a large, single storage entity instead of access to multiple, smaller storage entities such as main memory 120 and DASD device 155. Therefore, while data 121, operating system 122, application 123, and the packet controller 124 are shown to reside in main memory 120, those skilled in the art will recognize that these items are not necessarily all completely contained in main memory 120 at the same time. It should also be noted that the term “memory” is used herein to generically refer to the entire virtual memory of computer system 100, and may include the virtual memory of other computer systems coupled to computer system 100. Thus, while in FIG. 1, the application 123 and the packet controller 124 are all shown to reside in the main memory 120 of computer system 100, in actual implementation these software components may reside in separate machines and communicate over network 170.

At this point, it is important to note that while the present invention has been and will continue to be described in the context of a fully functional computer system, those skilled in the art will appreciate that the present invention is capable of being distributed as a program product in a variety of forms, and that the present invention applies equally regardless of the particular type of computer-readable signal bearing media used to actually carry out the distribution. Examples of suitable computer-readable signal bearing media include: recordable type media such as floppy disks and CD RW (e.g., 195 of FIG. 1), and transmission type media such as digital and analog communications links.

Network 170 may include a plurality of networks, such as local area networks, each of which includes a plurality of individual computers such as the computer 100 described above. Further, the computers may be implemented utilizing any suitable computer, such as the PS/2 computer, AS/400 computer, or a RISC System/6000 computer, which are products of IBM Corporation located in Armonk, N.Y. “PS/2”, “AS/400”, and “RISC System/6000” are trademarks of IBM Corporation. A plurality of intelligent work stations (IWS) (not shown) coupled to a processor may also be utilized in such a network. Network 170 may also include mainframe computers, which may be coupled to network 170 by means of a suitable communications link. A mainframe computer may be implemented by utilizing an ESA/370 computer, an ESA/390 computer, or an AS/400 computer available from IBM Corporation. “ESA/370”, “ESA/390”, and “AS/400” are trademarks of IBM Corporation.

Referring to FIG. 2, a more detailed schematic representation of computer system 100 is shown, which may be used for training packets according to preferred embodiments. Computer system 100 could be implemented in any of the computers on the network 170 as described above, or in a gateway server or mainframe computer. Computer system 100 can contain both hardware and software to implement the packet control features described herein.

Computer system 100 contains communications controller 152 connected to processor 110 and main memory 120 via system bus 160. Computer system 100 includes a processor utilization mechanism 112 capable of determining the level of utilization of the processor. Processor utilization mechanism 112 can be implemented in hardware or software. In a preferred embodiment, processor utilization mechanism 112 is implemented as an API call to the operating system that is supported by hardware in the processor that determines the ratio of the run cycles to the total number of cycles. The utilization mechanism could use any manner of processor metric to determine processor utilization or processor loading such as wait state tasks divided by total cycles, or other suitable metric.
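As a simple illustration of the run-cycle ratio, the utilization over an interval can be computed from two samples of a pair of cycle counters. The function below is a sketch under that assumption; the name utilization_percent and the counter arguments are hypothetical and do not correspond to any particular operating system API.

```python
def utilization_percent(run_before, total_before, run_after, total_after):
    """Return processor utilization over an interval as a percentage.

    The arguments are assumed to be monotonically increasing counters of
    run cycles and total cycles, sampled at the start and end of the interval.
    """
    total = total_after - total_before
    if total <= 0:
        raise ValueError("total cycle counter did not advance")
    run = run_after - run_before
    return 100.0 * run / total
```

For example, 300 run cycles out of 1000 total cycles in the interval yields a utilization of 30 percent, which can then be compared against the thresholds 125.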

Main memory 120 contains packet controller 124, which contains instructions capable of being executed by processor 110. In the alternative, packet controller 124 could be implemented by control circuitry through the use of logic gates, programmable logic devices, or other hardware components in lieu of a processor-based system. Packet controller 124 performs the packet-training method described herein below. Packet controller 124 includes one or more thresholds 125 for comparing to the utilization level of the processor. The thresholds are preferably selectable by the user or system programmer with an appropriate interface and stored in a memory area of the packet controller 124. For example, the thresholds may be set as part of the process to change TCP attributes with an appropriate request to the operating system 122 (FIG. 1).

In preferred embodiments, the packet controller 124 also includes one or more maximum train sizes 126 for setting the maximum number of packets in a packet train. Table 1 below shows an illustrative example of thresholds and associated maximum train size 126, which specifies the maximum number of packets in a packet train. For a threshold of 30% utilization, a maximum train size of 0 is set, indicating that packet training is disabled. For a threshold of 50% utilization, a maximum train size of 50 is set (a moderate size of packet train). For a threshold of 90% utilization, a maximum train size of 100 is set (a large size of packet train or the maximum sized packet train). The maximum train size is the number of packets that are accumulated before sending the packet train. The maximum train size and the invention herein can also be combined with the prior art method of a timer to send out a packet train after a selected amount of time. The listed thresholds and associated packet train sizes are for illustration only. Any suitable number of thresholds could be used with an associated packet train size to get a desired performance tradeoff.

TABLE 1

Threshold (% utilization)    30    50    90
Max Train Size                0    50    100 (or maximum)
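One way to apply a table of this kind is a simple banded lookup. The sketch below reads each threshold as the upper edge of a utilization band, with any utilization at or above the last threshold receiving the largest size; that reading is an assumption for illustration, since the table itself does not state how the bands are bounded. The names THRESHOLDS, MAX_TRAIN_SIZES and max_train_size are hypothetical.

```python
import bisect

# Illustrative values taken from Table 1.
THRESHOLDS = [30, 50, 90]        # percent utilization, ascending
MAX_TRAIN_SIZES = [0, 50, 100]   # 0 means packet training is disabled


def max_train_size(utilization):
    """Map a utilization percentage to a maximum train size.

    Assumed banding: below 30 -> 0, 30 up to below 50 -> 50,
    50 and above -> 100.
    """
    band = bisect.bisect_right(THRESHOLDS, utilization)
    return MAX_TRAIN_SIZES[min(band, len(MAX_TRAIN_SIZES) - 1)]
```

Keeping the thresholds and sizes in two parallel lists makes it straightforward to add further thresholds, as the description allows, without changing the lookup.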

Referring again to FIG. 2, communications controller 152 contains communications front-end 204, communications packet-controller 206, packet storage 208, and DMA (Direct Memory Access) controller 214, all connected via communications bus 212. DMA controller 214 is connected to DMA processor 210. Communications front-end 204 is connected to network 170, contains the circuitry for transmitting and receiving packets across network 170, and is employed to communicate with other nodes coupled to network 170.

When a packet is received by communications front-end 204 from network 170, the packet is examined by communications packet-controller 206 and stored in packet storage 208 before being sent to DMA processor 210. DMA processor 210 controls DMA controller 214. DMA controller 214 receives packets from communications bus 212 and sends the packets to processor 110 through system bus 160. The packets then are processed by packet controller 124 and stored in host memory 120. When host processor 110 desires to send packets to network 170, it transmits the packets from host memory 120 to packet storage 208 using DMA controller 214 and DMA processor 210. Communications packet-controller 206 then uses communications front-end 204 to transmit the packets from packet storage 208 across communications bus 212 to network 170.

Although a specific hardware configuration is shown in FIG. 2, a preferred embodiment of the present invention can apply to any hardware configuration that allows the training of packets, regardless of whether the hardware configuration is a complicated, multi-user computing apparatus, a single-user work station, or a network appliance that does not have non-volatile storage of its own.

FIG. 5 shows a method 500 of adjusting the packet compression or packet training according to a preferred embodiment. The method 500 starts periodically to check the processor utilization (step 510). The method may be started by a timer interrupt or some other suitable means to ensure the method runs with a suitable period. If the processor utilization is greater than a set threshold (step 520=yes), then the packet training is enabled (step 530). If the processor utilization is less than or equal to a set threshold (step 520=no) then the packet training is disabled (step 540). The threshold utilization percentage can be a parameter stored in memory that can be adjusted by a suitable software interface to the packet control mechanism 124 (FIG. 1).
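The decision made on each periodic run of method 500 reduces to a single comparison, sketched below. The function name training_enabled is hypothetical, and the periodic invocation (for example from a timer) is left to the caller.

```python
def training_enabled(utilization, threshold):
    """Decide whether packet training should be on for this period.

    Returns True when utilization is strictly greater than the threshold
    (training enabled, step 530) and False otherwise (training disabled,
    step 540).
    """
    return utilization > threshold
```

With a threshold of 50, a utilization of 75 enables training, while utilizations of 50 or 20 disable it, so lightly loaded systems send packets without the training delay.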

FIG. 6 shows another method 600 of adjusting the packet compression or packet training according to a preferred embodiment. The method 600 starts periodically to check the processor utilization (step 610). The method may be started as described above. If the processor utilization is less than a first set threshold (step 620=yes), then the packet training is disabled (step 630). If the processor utilization is greater than or equal to the first set threshold (step 620=no) then the method continues with step 640. If the processor utilization is less than a second set threshold (step 640=yes) then the maximum packet training is set to a first level (step 650). If the processor utilization is greater than or equal to the second set threshold (step 640=no) then the maximum packet training is set to a second level (step 660). The threshold utilization can be a parameter stored in memory that can be adjusted by a suitable software interface to the packet control mechanism 124 (FIG. 1). Similarly, other embodiments could include additional thresholds and corresponding maximum levels of packet training.
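The two-threshold flow of method 600 can be sketched as follows, with step 640 comparing against the second threshold. The function name select_max_train_size and its parameters are hypothetical, and returning 0 to represent disabled training is an assumption borrowed from the convention in Table 1.

```python
def select_max_train_size(utilization, first_threshold, second_threshold,
                          first_level, second_level):
    """Choose the maximum train size for the current period.

    Assumes first_threshold < second_threshold. A return value of 0 is
    used here to mean that packet training is disabled.
    """
    if utilization < first_threshold:    # step 620 = yes
        return 0                         # step 630: training disabled
    if utilization < second_threshold:   # step 640 = yes
        return first_level               # step 650
    return second_level                  # step 660
```

Called with thresholds of 30 and 50 and levels of 50 and 100, the function returns 0 for a utilization of 10, 50 for a utilization of 40, and 100 for a utilization of 95.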

As described above, there is a tradeoff between CPU load and communication latency when using packet training. Packet training decreases the load on the CPU but may increase the delay for a message to be sent over the network due to the delay in building a train of packets. The present invention provides the computer industry with an improved way to optimize the tradeoff between CPU loading and network latency to improve overall performance in a packet data network.

One skilled in the art will appreciate that many variations are possible within the scope of the present invention. Thus, while the invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that these and other changes in form and details may be made therein without departing from the spirit and scope of the invention.

1) A computer implemented method for packet training comprising the steps of: determining load on a processor; sending a plurality of data packets in a packet train over a network; and dynamically adjusting packet training on the network depending on the load on the processor to enable packet training when the load on the processor is above a threshold and disable packet training when the load on the processor is below the threshold.

2) The method of claim 1 further comprising the step of enabling and disabling packet training depending on the load on the processor.

3) The method of claim 1 further comprising the step of adjusting size of the packet train depending on the load on the processor.

4) The method of claim 3 comprising the step of comparing the load on the processor to a plurality of predetermined thresholds and setting a maximum size of a packet train depending on a corresponding predetermined threshold.

5) The method of claim 4 wherein the predetermined threshold is set by a user of the computer data system.

6) The method of claim 1 wherein the load on the processor is determined using an API call.